Effect of the mutation rate and background size on the quality of pathogen identification
Abstract
MOTIVATION Genomic-based methods have significant potential for fast and accurate identification of organisms, or even genes of interest, in complex environmental samples (air, water, soil, food, etc.), especially when the target organism cannot be isolated for a variety of reasons. Despite this potential, the presence of unknown, variable and usually large quantities of background DNA can cause interference that results in false positive outcomes.
RESULTS To estimate how the genomic diversity of the background (the total length of all the different genomes present in the background), the target length and the target mutation rate affect the probability of misidentification, we introduce a mathematical definition for the quality of an individual signature in the presence of a background. The definition is based on the signature's length and the number of mismatches needed to transform the signature into the closest subsequence present in the background. In conjunction with a probabilistic framework, it allows one to predict the minimal signature length required to identify the target in the presence of backgrounds of different sizes, and the effect of the target's mutation rate on the quality of its identification. The model assumptions and predictions were validated using both Monte Carlo simulations and examples from real genomic data. The proposed model can be used to determine appropriate signature lengths for various combinations of target and background genome sizes. It also predicts that no genomic signature will be able to identify the target if the target's mutation rate is >5%.
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
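The quantity underlying the signature-quality definition, the number of mismatches separating a signature from its closest subsequence in the background, can be illustrated with a minimal sketch. The function name `min_mismatches` is hypothetical, and the sketch assumes that "mismatches" means substitutions only (Hamming distance over equal-length windows). The abstract does not give the exact quality formula, so the code computes only the mismatch count and does not attempt to reproduce the paper's definition.

```python
def min_mismatches(signature: str, background: str) -> int:
    """Smallest number of substitutions separating `signature` from any
    equal-length window of `background` (brute-force scan).

    Assumption: mismatches are substitutions only; insertions and
    deletions are not considered.
    """
    k = len(signature)
    if k == 0 or len(background) < k:
        raise ValueError("background must be at least as long as a non-empty signature")

    best = k  # a window can differ in at most k positions
    for start in range(len(background) - k + 1):
        window = background[start:start + k]
        distance = sum(a != b for a, b in zip(signature, window))
        if distance < best:
            best = distance
            if best == 0:  # exact copy found in the background
                break
    return best


if __name__ == "__main__":
    # The window "ACGA" differs from "ACGT" in one position.
    print(min_mismatches("ACGT", "TTACGATT"))
```

A signature whose minimum mismatch count is small relative to its length is easily confused with the background. Intuitively, a larger background offers more windows and so more chances for a close match, which is consistent with the abstract's point that the required signature length depends on background size.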
Journal title:
- Bioinformatics
Volume 23, Issue 20
Pages -
Publication date 2007